Biomedical Signal Processing and Control — Latest Matching Preprints

1

Bridging Acoustic and Semantic Spaces for Interpretable Voice Scoring via Zero-Shot Semantic Expansion

Hsiao, C.; Cheng, Y.-R.; Yang, C.-Y.; Hsu, F.-S.

2026-06-01 health informatics 10.64898/2026.05.29.26354442 medRxiv

Top 0.4%

0.9%

Subjective auditory-perceptual evaluation and uninterpretable deep learning models limit the clinical assessment of voice disorders. This study proposes a two-phase zero-shot framework to evaluate voice pathology. First, an Audio Spectrogram Transformer is fine-tuned on the Perceptual Voice Quality Database to generate an acoustic latent space. Second, Orthogonal Procrustes analysis maps these acoustic embeddings directly onto the semantic space of a pre-trained Sentence Transformer. The geometric alignment produced continuous semantic axes that outperformed a supervised machine learning baseline in regressing clinician-rated GRBAS (Grade, Roughness, Breathiness, Asthenia, and Strain) severity scales. Furthermore, these axes correlate with traditional acoustic measures, including Harmonics-to-Noise Ratio and local jitter, while remaining robust when applied to aperiodic signals by not requiring fundamental frequency extraction. Most importantly, the model achieved zero-shot semantic expansion, successfully evaluating voices using an untrained, natural clinical vocabulary beyond the GRBAS scale. External validation on the Voice ICarus Database confirmed cross-corpus stability and demonstrated the capacity for zero-shot differential phenotyping of specific etiologies, such as hypokinetic dysphonia and reflux laryngitis. By bridging acoustic and semantic latent spaces, this framework offers an objective, continuous, and transparent metric for evaluating voice quality using voice descriptive vocabulary.
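The alignment step described in this abstract can be illustrated with a minimal sketch. It assumes that acoustic and text embeddings have the same dimensionality and that paired rows correspond to the same item. The function names and the anchor-based axis projection are hypothetical illustrations, not the authors' pipeline, which the abstract does not spell out.

```python
import numpy as np

def orthogonal_procrustes(A, B):
    """Return the orthogonal matrix R that minimizes ||A @ R - B||_F.

    A and B are (n_items, dim) arrays with matching shapes (an assumption
    of this sketch).
    """
    # The SVD of the cross-covariance A^T B gives the optimal R = U V^T.
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

def axis_scores(acoustic, R, anchor_pos, anchor_neg):
    """Score mapped acoustic embeddings along a semantic axis.

    The axis is the unit vector between two text-embedding anchors,
    a hypothetical construction used here only for illustration.
    """
    axis = anchor_pos - anchor_neg
    axis = axis / np.linalg.norm(axis)
    return (acoustic @ R) @ axis
```

Because R is orthogonal, the mapping preserves distances and angles within the acoustic space; only its orientation changes to match the semantic space.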

2

SeGA-GNN: Semantically Gated Augmented Graph Neural Networks for Wearable-Based Emotion Detection

Kurt, F.; Subasi, S. N.; Yakisan, E. S.; Subasi, A.

2026-06-01 health informatics 10.64898/2026.05.29.26354434 medRxiv

Top 0.4%

0.8%

Background: Wearable technologies enable scalable and continuous monitoring of emotional states through passive sensing of physiological and behavioral signals. However, conventional learning approaches often struggle to model the complex temporal, contextual, and relational dependencies underlying human emotions. To address these limitations, we propose a graph-based framework that represents multimodal wearable observations as heterogeneous knowledge graphs enriched with semantic information derived from Large Language Models (LLMs), enabling richer contextual understanding beyond raw sensor measurements. Methods: We constructed a heterogeneous knowledge graph using multimodal Fitbit physiological signals and affective self-report data collected from 45 users. Mood prediction and emotion detection were formulated as both binary and ternary node classification tasks. We evaluated five baseline heterogeneous Graph Neural Network (GNN) architectures and compared them with the proposed Semantically Gated Augmented Graph Neural Network (SeGA-GNN) framework, which dynamically integrates LLM-generated semantic embeddings into graph representations through a gated cross-modal fusion mechanism. Results: The baseline GNN models achieved strong performance, with classification accuracies ranging from 0.7525 to 0.9739 for binary classification and 0.6249 to 0.9699 for ternary classification. The proposed SeGA framework consistently improved predictive performance across most architectures. In particular, semantic augmentation transformed the HAN model from moderate baseline performance into near-perfect emotion recognition capability, achieving SeGA-HAN Accuracy = 0.9988 and AUC = 1.0000 for binary classification and Accuracy = 0.9979 and AUC = 1.0000 for ternary classification.
Discussion and Conclusion: Integrating LLM-derived semantic contextualization into heterogeneous graph learning enables effective modeling of contextual information that is not directly captured by wearable physiological signals alone. The proposed SeGA-GNN framework demonstrates that adaptive semantic fusion substantially improves the accuracy, robustness, and interpretability of wearable-based emotion detection. These findings establish a promising direction for next-generation wearable affective computing systems and intelligent emotion-aware applications.

3

PIE Toolbox: SSM-PCA Based Software for PET Diagnostic Pattern Analysis

Romanov, M.; Kireev, M.; Didur, M.; Cherednichenko, D.; Korotkov, A.; Valdes-Sosa, P.; Fan, Q.; Wang, Q.

2026-06-01 radiology and imaging 10.64898/2026.05.28.26354341 medRxiv

Top 0.7%

0.5%

One of the prominent methods in neuroimaging data processing is SSM-PCA, which is based on principal component analysis and allows for the identification of diagnostically significant patterns in the form of statistical maps. We developed PIE Toolbox, software that employs SSM-PCA and classification based on diagnostic patterns derived from functional and structural tomographic brain imaging. The program supports the entire analysis pipeline, including preprocessing of brain images, extraction of diagnostic patterns, building of classification models, and prediction based on them. The resulting diagnostic patterns are weighted principal components obtained through SSM-PCA, or their linear combinations. PIE Toolbox allows selection of relevant structural and functional brain patterns, computation of their expression values in regions of interest, classification using support vector machines, and evaluation of model performance via cross-validation. This approach enables the use of patterns as features of intergroup differences for individual diagnosis. The software has been validated on both simulated and ADNI datasets.
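The core of SSM-PCA can be sketched in a few lines, assuming the commonly described formulation: log-transform, removal of each subject's mean, removal of the group mean profile, then PCA. This is not PIE Toolbox code; the function name, array layout, and return values are assumptions made for illustration.

```python
import numpy as np

def ssm_pca(images, n_components=2):
    """Sketch of SSM-PCA on a (subjects, voxels) array of positive values."""
    logged = np.log(images)
    # Remove each subject's own mean, which absorbs global scaling in log space.
    row_centered = logged - logged.mean(axis=1, keepdims=True)
    # Remove the group mean profile, leaving subject residual profiles.
    srp = row_centered - row_centered.mean(axis=0, keepdims=True)
    # PCA via SVD: rows of Vt are voxel patterns.
    _, S, Vt = np.linalg.svd(srp, full_matrices=False)
    patterns = Vt[:n_components]
    # Expression score of each pattern in each subject.
    scores = srp @ patterns.T
    explained = (S ** 2 / np.sum(S ** 2))[:n_components]
    return patterns, scores, explained
```

The per-subject expression scores are the kind of feature that a downstream classifier, such as the support vector machine mentioned in the abstract, could take as input.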

4

A Consensus-Driven Stacking Ensemble Framework for Interpretable Cardiovascular Risk Prediction and Clinical Deployment

Sozol, S. S.; Dev Nath, B. C.; Fahim, F. M. S.; Suzana, N. N.; Mirza, J. F.; Ahmmed, S.; Zohra, F.-T.; Zafr, A. H. A.; Uddin, M. N.; Mondal, M. R. H.; Hoque, A. S. M. L.

2026-05-26 health informatics 10.64898/2026.05.18.26352989 medRxiv

Top 0.8%

0.3%

Machine learning (ML) is being considered to help diagnose cardiovascular diseases (CVD). Still, challenges such as inconsistent and limited datasets, limited infrastructure, and global inequalities create the need for a reliable and practicable ML solution. This paper presents an ML-driven framework for predicting CVD risk scores and classifying status. Several data preprocessing techniques, including multiple imputation by chained equations (MICE) and outlier removal, are applied. In addition, hyperparameter tuning is performed with the GridSearchCV tuning technique. Moreover, a consensus-driven five-feature selection method is applied to identify optimal predictors. The dataset used in this study contains healthcare records related to future CVD risk scores, comprising 1,529 patient records with 22 features. The optimized stacked ensemble model is applied to the dataset and achieves a cross-validated coefficient of determination value of 98.13% for CVD risk score regression. Comparative evaluation with other ML models confirmed improved accuracy, efficiency, and interpretability. The explainable AI technique SHAP is applied to interpret predictions and highlight key risk factors. Moreover, a deployment-ready web platform with multi-role access has been developed that demonstrates clinical applicability. The proposed framework offers a reliable and interpretable tool for early detection of CVD and personalized risk assessment. In the future, this work can be extended to integrate longitudinal data, medical imaging, and deep learning to improve generalizability and strengthen real-world impact.

5

Wilson's Central Terminal Changes Location on the Body Surface During the P-Wave: Why Precordial Leads Might Not Be What We Think

Bender, J.; Stoks, J.; Barrios Espinosa, C.; Becker, S.; Cluitmans, M. J. M.; Loewe, A.

2026-05-28 cardiovascular medicine 10.64898/2026.05.20.26352966 medRxiv

Top 1.0%

0.3%

Background and Aims: Clinical interpretation of the precordial leads V1-V6 assumes that Wilson's central terminal (WCT) has a fixed anatomical location. Consequently, a positive signal corresponds to electrical activation spreading from WCT towards the respective electrode, and vice versa. However, the location of WCT has never been systematically investigated, although a better understanding of it could improve the interpretation of the precordial leads. This work aims to characterize the spatial extent and location of the physical WCT, i.e., the electrical potential defined by the WCT, on the body surface during the P-wave. Methods: An intensive analysis of body surface potential maps (BSPMs) during atrial depolarization in an in silico patient cohort and clinical data was conducted. Results: During the P-wave, the location of WCT was not stationary; its spatial extent and location varied over time as well as across individuals. Four distinct spatial patterns of WCT distribution on the body surface were identified in silico, and three of these were found in the clinical cohort. WCT signals agreed with BSPM signals at commonly assumed positions of WCT only for a small fraction of the P-wave. Conclusion: The spatial extent and location of WCT change during the P-wave and thus should be considered when interpreting the precordial leads.
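For readers less familiar with the reference in question: WCT is the average of the three limb electrode potentials, and each precordial lead is the chest electrode potential minus that average. The sketch below shows this, plus one hypothetical way to find body surface nodes whose potential matches WCT at a given instant. The tolerance and the node list are assumptions for illustration, not the authors' method.

```python
def wct_potential(phi_ra, phi_la, phi_ll):
    """Wilson's central terminal: mean of right arm, left arm, left leg potentials."""
    return (phi_ra + phi_la + phi_ll) / 3.0

def precordial_lead(phi_electrode, phi_ra, phi_la, phi_ll):
    """Unipolar precordial lead: chest electrode potential referenced to WCT."""
    return phi_electrode - wct_potential(phi_ra, phi_la, phi_ll)

def nodes_matching_wct(surface_potentials, wct, tolerance):
    """Indices of surface nodes whose potential lies within `tolerance` of WCT.

    `surface_potentials` is a list of potentials at one time instant;
    the tolerance criterion is a hypothetical choice for this sketch.
    """
    return [i for i, phi in enumerate(surface_potentials)
            if abs(phi - wct) <= tolerance]
```

Repeating `nodes_matching_wct` for each time sample of the P-wave would show whether the matching region stays in place or moves, which is the question the abstract raises.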

6

Identification of a Fractional Model for an Outbreak of the Dengue Fever

Cresson, J.; Pere, M.; Szafranska, A.

2026-05-27 epidemiology 10.64898/2026.05.26.26354120 medRxiv

Top 1%

0.2%

This work focuses on the global and partial identification problem for fractional differential equations. We provide a general numerical procedure based on global and local optimization algorithms with two refinements for biological systems that ensure solution positivity and homogeneous parameter units. The method is applied to a new fractional model of Dengue outbreak called the Fractional Homogeneous Nishiura (FHN) model, calibrated using data of newly infected people in Cape Verde. We show that our identification method yields a better fit between data and model solutions than previous approaches and that our FHN model captures the dynamics of Dengue more closely than existing systems.

7

Segmental Lung Sound Analysis in Obstructive Lung Diseases Using Electronic Stethoscope; a protocol to establish an acoustic repository

Anuradha, H.; Yasaratne, D.; GMRI, G.; Parakrama, E.; Severin, R.

2026-05-28 respiratory medicine 10.64898/2026.05.27.26354263 medRxiv

Top 1%

0.2%

Introduction Obstructive lung diseases (OLDs) are responsible for high rates of illness and death worldwide. Inflammation, chronic airflow limitation, and bronchial remodeling occur in OLDs and eventually result in unique respiratory sounds. Despite being subjective and poorly reproducible, traditional auscultation with a manual stethoscope remains the main method for identifying lung sounds. Recent advances in digital stethoscopes and artificial intelligence (AI) now permit objective measurement of lung sounds. However, standardized, region-specific databases for AI training and validation are lacking. Although lung sound classification is an emerging area in research and telerehabilitation, lobe-wise acoustic patterns remain unexplored because no existing database is available to train AI models. To address this gap, this study aims to develop an acoustic repository and to analyze segmental lung sounds recorded with an electronic stethoscope from patients with OLDs and healthy controls. Methods and analysis This is a cross-sectional observational study involving 120 participants (60 OLD patients and 60 healthy controls). Lobe-wise acoustic signals will be captured using an electronic stethoscope in healthy and diseased populations. Recordings will be annotated in Audacity and then used for feature extraction and statistical analysis. The acoustic features extracted through Audacity will include frequency, intensity, pitch, and root mean square (RMS) energy. Repeated-measures ANOVA will be applied to compare mean sound intensities across lung segments, while Pearson correlation will be used to assess associations with body composition parameters. The data will then be standardized for AI-based diagnostic applications.
Ethics and dissemination Ethical approval is being sought from the Ethics Review Committee, Faculty of Medicine, University of Peradeniya (2025/EC/87). Written informed consent will be obtained. Results will be disseminated through peer-reviewed publications and a public database of lung sounds from the region.

8

Deep learning optimisation for cardiology: Neural Architecture Search-driven arrhythmia classification with electrocardiograms

Vanegas Mueller, E.; Joe-Oshodi, A.; Banerjee, A.; Villarroel, M.

2026-05-30 cardiovascular medicine 10.64898/2026.05.28.26354348 medRxiv

Top 2%

0.0%

Cardiovascular disease is the leading cause of death worldwide. Sudden cardiac death (SCD) accounts for roughly 50% of all cardiac deaths. The electrocardiogram (ECG) is widely used for early diagnosis of cardiac disease. However, the complexity of accurate interpretation limits the ECG's efficacy. Modern deep learning methods have been applied to assist clinicians in diagnosis. We applied Neural Architecture Search (NAS), an automated machine learning technique, to identify optimal deep learning architectures for classifying cardiac arrhythmias from ECGs. We applied the Differentiable Architecture Search strategy to an AutoFormer search space to identify optimal self-attention architectures for arrhythmia classification. We trained, validated, and tested the resulting model on the PhysioNet Challenge 2021 dataset (n = 88,253), comprising ECGs across three continents. We performed a hyperparameter optimisation on the NAS output, exploring input patch size, class weighting, and loss function. We evaluated performance using the PhysioNet Challenge metric and the area under the receiver operating characteristic curve (AUROC). The NAS converged towards minimal architectural configurations (embedding dimension: 384, depth: 4, self-attention heads: 4, MLP ratio: 1) with a validation challenge metric of 0.66 (PhysioNet Challenge 21 Winner: 0.63). The NAS-created network achieved an AUROC of 0.97 and a challenge metric of 0.71 during testing. Normal Sinus Rhythm and Sinus Tachycardia achieved AUROCs of 0.99. Low-QRS Voltage and T-wave abnormality were the worst-performing arrhythmias, with AUROCs of 0.89 and 0.90, respectively. We interpret that architectural simplicity drives performance in arrhythmia classification. Because SCD is unexpected, prevention strategies in free-living environments require lightweight computational resources suitable for wearable devices. 
Class imbalance fundamentally limits classification performance for rare arrhythmias such as Low-QRS Voltage and T-wave inversion, irrespective of hyperparameter choices. However, the self-attention mechanism can autonomously abstract clinical representations, simplifying clinical deployment by eliminating the need for an explicit feature-extraction pipeline.

9

Comparative Study on Image Quality of Deep Learning and Adaptive Statistical Iterative Reconstruction-V in Thin Layer CT of liver Lesions

Yang, J.; Li, L.; Cao, J.; Zhang, J.

2026-05-26 radiology and imaging 10.64898/2026.05.23.26353923 medRxiv

Top 3%

0.0%

Objective: This study aims to compare the advantages and disadvantages of deep learning image reconstruction (DLIR) and adaptive statistical iterative reconstruction-V (ASIR-V) in thin-slice (2.5 mm) CT images of hepatic lesions characterized by high and low contrast. Additionally, the study seeks to determine the optimal DLIR strength for the evaluation of liver lesions. Methods: A retrospective analysis was performed on 90 patients who underwent abdominal contrast-enhanced CT scans. Group A comprised 48 patients with low-contrast lesions, while Group B included 42 patients with high-contrast lesions. The acquired images were reconstructed using post-processing DLIR at low (DLIR-L), medium (DLIR-M), and high (DLIR-H) strengths, all with a slice thickness of 2.5 mm (subgroups A1-A3, B1-B3). Furthermore, images were reconstructed with ASIR-V at 50% strength at slice thicknesses of 2.5 mm and 5 mm (subgroups A4/B4 and A5/B5, respectively). CT values and standard deviations (SD) of the liver and lesions were measured, and the corresponding signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) were calculated. The edge rise slope (ERS) was determined using ImageJ software by measuring CT values along a line from the liver parenchyma to the lesion. Objective metrics were compared using one-way ANOVA, with independent samples t-tests applied for inter-group differences. Subjective scoring, which encompassed noise level, diagnostic confidence, and lesion margin delineation, was conducted by two radiologists, with differences analyzed using the Kappa test. Results: Objective evaluation revealed a progressive decrease in lesion SD and a progressive increase in SNR and CNR from subgroups A1/B1 to A3/B3. The SD of subgroup A2 decreased by 57.4% compared to A4, while the SNR and CNR of A2 increased by 19.3% and 24.6%, respectively, compared to A4. Although subgroup B2 had a lower SNR than B5, the difference was not statistically significant. SNR and CNR in B2 increased by 24.1% and 11.9%, respectively, compared to B4.
ERS gradually decreased from A1/B1 to A3/B3. ERS values in A2 and B2 increased by 27.0% and 39.4%, respectively, relative to A5 and B5. Although A3 had a lower ERS than A1 and A2, all DLIR subgroups exhibited higher ERS than A5; similar trends were observed in Group B. Subjective evaluation indicated good inter-reader agreement (Kappa > 0.61, p < 0.05). As DLIR strength increased, noise scores rose progressively in both groups. However, noise in A2 and B2 was lower than in A4/A5 and B4/B5. Diagnostic confidence and lesion margin delineation scores were highest in A2 and B2, while all subjective scores were lowest in A5 and B5. Discussion: Most prior studies evaluated the liver or vessels, or confirmed that image quality can be maintained at low doses. However, there are few studies on specific individual lesions, which this study therefore investigates. The details and detection rate were analyzed separately to confirm the clinical acceptability of 2.5 mm DLIR images in lesions of different contrast. Conclusion: For both high- and low-contrast hepatic lesions, DLIR provides superior image quality compared to ASIR-V, with the 2.5 mm DLIR-M setting being optimal. DLIR-M reduces image noise, improves spatial resolution, and produces images more suitable for diagnostic purposes.
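The abstract does not state how SNR and CNR were computed. The sketch below uses one common convention: SNR as mean CT value over noise SD, and CNR as the absolute liver-to-lesion difference over noise SD. These formulas, the function names, and the choice of noise SD are all assumptions that may differ from the study's definitions.

```python
def snr(mean_ct, noise_sd):
    """Signal-to-noise ratio: mean CT value (HU) divided by noise SD (HU)."""
    return mean_ct / noise_sd

def cnr(liver_ct, lesion_ct, noise_sd):
    """Contrast-to-noise ratio: absolute liver-lesion difference over noise SD."""
    return abs(liver_ct - lesion_ct) / noise_sd

def percent_change(reference, value):
    """Relative change of `value` against `reference`, in percent."""
    return (value - reference) / reference * 100.0
```

A statement such as "CNR of A2 increased by 24.6% compared to A4" then corresponds to `percent_change(cnr_a4, cnr_a2)`, with both CNR values computed under the same convention.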

10

Optical coherence tomography as a biomarker for frontotemporal dementia: a systematic review & meta-analysis

Wang, E.; Kohli, A.; Taha, H. B.

2026-05-27 neurology 10.64898/2026.05.19.26353366 medRxiv

Top 3%

0.0%

Background: Frontotemporal dementia (FTD) lacks widely accessible disease-specific biomarkers. Optical coherence tomography (OCT) and OCT angiography (OCTA) may provide non-invasive measures of retinal changes associated with neurodegeneration. We conducted a systematic review and meta-analysis evaluating retinal biomarkers in FTD compared with Alzheimer disease (AD) and controls. Methods: A systematic search of PubMed and Embase was conducted through April 25, 2026 according to PRISMA guidelines. Studies evaluating OCT/OCTA biomarkers in FTD with comparator groups were included. Inverse weighted random-effects models, publication bias assessments, and meta-regressions were performed. Results: Ten studies involving 139 individuals with FTD, 87 with AD, 29 with mild cognitive impairment, 14 with TDP-43 proteinopathy, 5 with tauopathy, and 255 controls were included in the systematic review; five studies were eligible for meta-analysis. Compared with AD, individuals with FTD demonstrated significantly thinner retinal nerve fiber layer (RNFL) thickness (SMD = -0.61, 95% CI -0.98, -0.24). Compared with controls, individuals with FTD exhibited significantly thinner ganglion cell layer-inner plexiform layer (GCL-IPL) thickness (SMD = -0.55, 95% CI -1.02, -0.08), whereas pooled analyses across multiple retinal biomarkers were non-significant (SMD = -0.19, 95% CI -0.52, 0.14). RNFL thickness correlated negatively with female % in FTD and positively with age in both AD and controls. Conclusions: Individuals with FTD exhibit lower RNFL thickness than AD and lower GCL-IPL thickness than controls, suggesting retinal alterations may reflect neurodegeneration. However, larger longitudinal studies with standardized OCT/OCTA protocols are needed to determine the diagnostic and prognostic utility of retinal biomarkers in FTD.
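An inverse-variance random-effects pooling of standardized mean differences can be sketched as follows. The DerSimonian-Laird estimator of between-study variance is an assumption here, since the abstract does not name the estimator, and the function name is hypothetical.

```python
import math

def random_effects_pool(effects, variances):
    """Pool study effects with inverse-variance random-effects weights.

    Uses the DerSimonian-Laird estimate of tau^2 (an assumption of this
    sketch). Returns (pooled effect, 95% CI, tau^2).
    """
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q measures dispersion around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2
```

A pooled estimate is conventionally called significant at the 5% level when the returned interval excludes zero, as with the RNFL and GCL-IPL intervals reported above.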

11

Vaginal Antisepsis for Major Gynecologic Surgeries Using Chlorhexidine Gluconate versus Povidone Iodine: A Systematic Review and Meta-Analysis

Dias, Y.; Gebrekidan, F.; Lowder, J.; Sutcliffe, S.; Yaeger, L.

2026-05-27 obstetrics and gynecology 10.64898/2026.05.26.26353429 medRxiv

Top 3%

0.0%

ABSTRACT OBJECTIVE: We performed a systematic review and meta-analysis (SRMA) of post-surgical outcomes, comparing chlorhexidine gluconate (CHG) versus povidone iodine (PI) for vaginal antisepsis of major gynecologic procedures. DATA SOURCES: Ovid Medline, Embase, Scopus, Cochrane, and Clinicaltrials.gov were searched between 1986 and December 2023 for studies comparing CHG with PI for vaginal antisepsis of major gynecologic operations. STUDY ELIGIBILITY CRITERIA: We included randomized controlled trials (RCTs) and non-RCTs comparing CHG to PI for vaginal antisepsis of major gynecologic operations. The primary outcome was surgical site infections (SSIs), and the secondary outcomes were urinary tract infections (UTIs) and vaginal irritation. METHODS: Summary estimates were calculated by fixed effects models when I² ≤ 25% and by random effects models when I² > 25%. Statistical analysis was performed using RevMan 5.4.1. The protocol for this systematic review was registered on PROSPERO (ID CRD42022378101). RESULTS: Nine studies met the inclusion criteria, four of which were RCTs. A total of 9538 patients were included, 4300 (45%) of whom were allocated to CHG and 5238 (55%) to PI. No statistically significant difference in SSI incidence was found for vaginal antisepsis with CHG versus PI in pooled analyses (n = 9538 patients; RR 1.20; 95% CI 0.92-1.57; I² = 0%). In contrast, a significantly higher risk of UTIs was observed for vaginal antisepsis with CHG than with PI (n = 6061 patients; RR 1.48; 95% CI 1.03-2.14; I² = 0%). CONCLUSION: In our SRMA, there were no significant differences in SSI risk when either CHG or PI was utilized for antiseptic vaginal preparation. Interestingly, vaginal antisepsis with PI was associated with a lower incidence of post-operative UTIs following major gynecologic surgery. Our findings support current guidelines that either form of vaginal antisepsis can be used for SSI prevention.
They also suggest that PI may result in fewer postoperative UTIs, but further randomized studies are needed to support these findings. Key words: surgical site infection, surgical wound infection, urinary tract infection, urogynecologic surgery, Chlorhexidine, Povidone Iodine, surgical antiseptic
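The model-selection rule in the METHODS section (fixed effects when I² ≤ 25%, random effects otherwise) can be sketched alongside a per-study risk ratio. The analysis itself was run in RevMan; the functions below are hypothetical illustrations using standard textbook formulas, not a reproduction of that software.

```python
import math

def risk_ratio(events_a, total_a, events_b, total_b):
    """Risk ratio of group A versus group B with a 95% CI on the log scale."""
    rr = (events_a / total_a) / (events_b / total_b)
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lo = math.exp(math.log(rr) - 1.96 * se_log)
    hi = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, (lo, hi)

def i_squared(log_effects, variances):
    """I^2 heterogeneity statistic in percent, derived from Cochran's Q."""
    w = [1.0 / v for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, log_effects))
    df = len(log_effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

def choose_model(i2_percent, threshold=25.0):
    """Apply the abstract's rule: fixed effects at or below the threshold."""
    return "fixed" if i2_percent <= threshold else "random"
```

With I² = 0% reported for both the SSI and UTI analyses, this rule would select the fixed effects model in each case.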

12

An ECG foundation model for generalizable cardiac function prediction across the lifespan

Yang, Y.; Peracchio, L.; Mayourian, J.; Miller, T.; La Cava, W.

2026-05-27 health informatics 10.64898/2026.05.26.26354128 medRxiv

Top 3%

0.0%

Background Artificial intelligence-enhanced electrocardiography (AI-ECG) enables scalable, low-cost cardiac dysfunction screening, but existing models are annotation-intensive and predominantly adult-derived, leaving paediatric generalizability uncertain. Paediatric cohorts exhibit highly variable cardiac morphology and function compared to adults, which may be useful for learning generalizable AI-ECG models. Methods We pretrained ECG-Fyler on a predominantly paediatric, all-age cohort at Boston Children's Hospital (1992-2023), annotated with a cardiology-specific coding system (Fyler codes), and evaluated it on assessments from echocardiography (echo) and cardiac magnetic resonance (CMR) studies. We validated on an external adult cohort from Columbia University Irving Medical Center. Performance was benchmarked against several AI-ECG foundation models by AUROC across age groups, lesion types, and limited-data scenarios. Findings The pretraining cohort comprised 782,138 ECGs from 255,271 patients (median age: 10.9 years, IQR: [2.8-16.8]). Internal evaluation included 178,495 ECG-echo pairs (median age: 10.9 [3.7-17.0]) and 8,584 ECG-CMR pairs (median age: 20.7 [15.6-29.6]). External validation included 82,543 ECG-echo pairs from adults (median age: 64.0 [52.0-74.0]). ECG-Fyler improved AUROC across biventricular dysfunction and dilation tasks, with the largest gains in low-data settings. In internal validation, ECG-Fyler detected low left ventricular ejection fraction (LVEF ≤ 40%) from only 100 fine-tuning samples (AUROC: 0.80, 95% CI: [0.78-0.80]), outperforming other models (AUROC < 0.65) and improving with additional fine-tuning (AUROC: 0.94 [0.93-0.94]). Similar improvements were observed for CMR-derived LVEF, RVEF, and ventricular dilation. In external validation on adults, ECG-Fyler exhibited an AUROC of 0.83 (CI: [0.82-0.85]) for LVEF ≤ 40%.
After fine-tuning on less than 10% of external data, LVEF ≤ 45% performance (AUROC: 0.87 [0.86-0.88]) outperformed a fully trained, site-specific prior model (AUROC: 0.85 [0.84-0.87]). Interpretation Pretraining on richly annotated, paediatric-dominant ECGs yields models that transfer efficiently across institutions and ages, supporting AI-ECG screening and triage when labels or imaging access are limited. Funding National Institutes of Health (R01LM012973); Kostin Innovation Fund, Boston Children's Hospital

13

Patient Versus Prediction-Level Evaluation of a Dynamic Clinical Prediction Model of Sepsis

Tuttle, M.; Maas, C. C. H. M.; An, J.; Wessler, B. S.; Harvey, W. F.; Selker, H. P.; van Klaveren, D.; Kent, D. M.

2026-05-27 health systems and quality improvement 10.64898/2026.05.26.26354141 medRxiv

Top 3%

0.0%

The Epic Sepsis Model version 2 (ESMv2) is a prediction model embedded in the electronic medical record that warns clinicians which hospitalized patients are at risk for sepsis. We conducted a retrospective cohort study of 31,951 hospitalizations of 25,760 patients to compare analyses conducted at the commonly used patient level (where the maximum prediction prior to the onset of sepsis is used to measure performance) with analyses at the novel prediction level (where each prediction is used to measure performance). Sepsis, defined by the Sepsis-3 criteria, occurred during 1,049 hospitalizations (3.3%). Patient-level analyses suggested excellent discrimination (AUC 0.86; IQR 0.85, 0.87), whereas prediction-level analyses demonstrated lower performance (AUC 0.62; IQR 0.57, 0.65). Low estimates of the positive predictive value (14.5% at the patient level vs 4% at the prediction level) imply a high number of false alerts. Common evaluation approaches may overstate the performance of dynamic prediction models and mislead clinical decision-making.
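The difference between the two evaluation levels can be made concrete with a toy sketch. The data layout (a dict of hospitalizations, each a list of `(score, label)` predictions), the per-prediction labeling, and the numbers are hypothetical. The sketch also assumes that every listed prediction precedes sepsis onset.

```python
def auc(scores, labels):
    """Area under the ROC curve as the probability that a positive outranks a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def patient_level_auc(hospitalizations):
    """One score per hospitalization: the maximum prediction."""
    scores, labels = [], []
    for preds in hospitalizations.values():
        scores.append(max(s for s, _ in preds))
        labels.append(int(any(y == 1 for _, y in preds)))
    return auc(scores, labels)

def prediction_level_auc(hospitalizations):
    """Every individual prediction counts as its own observation."""
    scores = [s for preds in hospitalizations.values() for s, _ in preds]
    labels = [y for preds in hospitalizations.values() for _, y in preds]
    return auc(scores, labels)

# Hypothetical toy data: hospitalization "A" develops sepsis, "B" and "C" do not.
toy = {
    "A": [(0.2, 1), (0.3, 1), (0.9, 1)],
    "B": [(0.4, 0), (0.5, 0), (0.6, 0)],
    "C": [(0.1, 0), (0.7, 0)],
}
```

In this toy set, taking the maximum hides the two low scores the septic patient received before the high one, so the patient-level AUC comes out higher than the prediction-level AUC. That is the direction of the gap the abstract reports.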

14

Morphological feature remodeling of intracranial arteries in the context of inflammation and HIV-associated cognitive impairment

Hoang, N.; Yang, H.; Uddin, M. N.; Zhong, J.; Faiyaz, A.; Singh, M. V.; Boodoo, Z. D.; Sutton, K. R.; Wang, H. Z.; Sahin, B.; Khan, M. W.; Weber, M. T.; Yuan, C.; Chen, L.; Schifitto, G.

2026-05-27 hiv aids 10.64898/2026.05.19.26353071 medRxiv

Top 3%

0.0%

Background: Despite the success of combination antiretroviral therapy (cART), vascular comorbidities, including cerebrovascular disease, are more prominent in people living with HIV (PLWH) compared to people without HIV (PWOH). However, quantitative assessments of cerebrovascular morphometry and their associations with cognitive outcomes in the context of HIV are still limited. In this study, we explore this missing link. Methods: Magnetic Resonance Angiography (MRA) data, blood markers, and neurocognitive assessments were collected from 73 PWOH subjects (male: 57, female: 16; age: 53 ± 16) and 99 PLWH subjects (male: 66, female: 30, age: 53 ± 11). Vessel morphometric features were quantified using intraCranial Artery Feature Extraction (iCafe) to investigate associations between vessel morphometry, markers of monocytes, endothelial cell activation, and cognitive performance. Results: HIV status predicted a lower total number of branches (β = -0.224, p = 0.001, d = -0.517) and shorter total distal length (β = -0.173, p = 0.021, d = -0.370) with a moderate effect size. Total branch number was found to be negatively associated with plasma levels of monocyte markers (sCD14: r = -0.167, p = 0.033; sCD163: r = -0.157, p = 0.045) and positively correlated with white matter cerebral blood flow (r = 0.550; p ≤ 0.05). HIV status was the strongest predictor of overall cognitive performance in ANCOVA model (β = -0.219, p = 0.006, d = -0.453). Conclusions: Our results suggest that cognitive impairment in PLWH is associated with vessel morphology metrics. Monocyte immune activation may contribute to changes in vessel morphology.

15

Can Large Language Models Diagnose Primary Immunodeficiency from Patient-Described Symptoms?

Reteig, L. C.; Woloshin, S.; Maglione, P. J.; Farmer, J. R.; Ong, M.-S.

2026-05-27 allergy and immunology 10.64898/2026.05.26.26353818 medRxiv

Top 3%

0.0%

Patients with primary immunodeficiency (PID) often face prolonged diagnostic delays and may increasingly turn to large language models (LLMs) to interpret their symptoms during this period. We evaluated whether an LLM could recognize PID from symptom descriptions derived from interviews with 21 PID patients. In a prior study, we showed that GPT-4o identified PID in 96% of cases when prompted with physician-written patient histories (Rider et al., JACI, 2024). Here, when prompted with symptom descriptions in patients' own words, GPT-5 identified PID in only 7 cases (33%), although it more broadly suggested immune system issues in 18 cases (81%). The gap between these findings indicates that LLMs are sensitive to the language and framing of symptom descriptions, performing substantially worse when patients describe their own symptoms in everyday language than when clinicians summarize patient histories in structured medical terms. This study underscores the need to carefully evaluate how LLMs are used in patient-facing applications.

16

ERBB4 deficiency promotes atrial myopathy underlying the atrial fibrillation substrate

Yamaguchi, N.; Santucci, J.; Hong, S. J.; Ferrena, A.; Schlamp, F.; Willett, D.; Casdin, C. J.; Park, P. S.; Lin, X.; Xiao, J.; Hall, S.; Barnard, J.; Achter, J.; Kanhert, K.; Lundby, A.; Chung, M. K.; Van Wagoner, D. R.; Park, D. S.

2026-05-27 cardiovascular medicine 10.64898/2026.05.26.26354173 medRxiv

Top 3%

0.0%

Background: Atrial fibrillation (AF) is a leading cause of stroke, cardiovascular morbidity, and mortality. Atrial myopathy, characterized by progressive metabolic, electrical, and structural changes, creates the arrhythmogenic substrate that drives AF. Defining the key drivers of atrial myopathic processes is essential for targeted therapies that can mitigate AF progression. Here we explore how reduced ERBB4 expression contributes to the development of left atrial myopathy. Methods: We analyzed the Cleveland Clinic Biobank to compare left atrial ERBB4 levels in patients grouped by AF diagnosis. To investigate the impact of reduced ERBB4 levels on atrial tissue substrate, we created mouse models of cardiac-specific Erbb4 deficiency using Mlc2a (myosin light chain 2a)-Cre. Comprehensive physiological assessments were performed. Transcriptomic analyses of the left atrium were performed in an Erbb4 haploinsufficient mouse model and compared with human atrial datasets. Molecular validation of key dysregulated pathways was performed. Results: We found that left atrial ERBB4 levels are reduced in patients with AF. Adult cardiomyocyte-specific Erbb4 heterozygous (Erbb4fl/+;Mlc2a-Cre) mice exhibited prolonged P-wave duration in the absence of ventricular dysfunction. Left atrial transcriptomic analysis in Erbb4 haploinsufficient mice showed upregulation of pathways related to fibrosis, apoptosis, and coagulation, and downregulation of pathways related to fatty acid metabolism and mitochondrial function, mirroring changes observed in pressure overload mouse models. A cross-species transcriptomic comparison revealed significant overlap between ERBB4-correlated gene expression and functional pathways in adult human atria and mice with Erbb4 haploinsufficiency. Validating the transcriptomic data, protein and functional assays demonstrated increased fibrosis, apoptosis, and oxidative stress in the mutant left atrial tissue. Conclusion: Left atrial ERBB4 levels are reduced in AF patients. A mouse model of Erbb4 deficiency and human atrial transcriptomic analyses highlight a role for ERBB4 in supporting normal atrial metabolism while protecting against inflammation, apoptosis, and fibrosis.

17

Early Life Determinants of Forward Compression Wave Intensity in Adults

Haynes, A.; Mynard, J. P.; van der Veen, M.; Carson, J.; Green, D. J.

2026-05-27 cardiovascular medicine 10.64898/2026.05.26.26354176 medRxiv

Top 3%

0.0%

Intro: Characteristics of the pulse wave transmitted through the carotid arteries are predictive of cognitive decline and cerebrovascular health in humans. This study aimed to identify risk factor trajectories in childhood, adolescence and early adulthood that are associated with forward compression wave intensity (FCWI) in the common carotid artery in adults aged 28 years. Methods: Systolic blood pressure (SBP), body mass index (BMI) and fasting blood glucose (FBG) measured at multiple time-points when participants were aged 8 to 20 years were included in a trajectory analysis. At age 28 years, FCWI was measured in 402 (M=206, F=196) participants who underwent a Duplex ultrasound assessment of the common carotid artery. Statistical analysis assessed differences in FCWI between trajectory groups for males and females separately. Results: In males, four trajectory groups were identified for BMI, three for SBP, and two for FBG. In females, three trajectory groups were identified for each of BMI, SBP, and FBG. In males, having higher BMI (P=0.006), SBP (P=0.021) and FBG (P=0.002) from ages 8 to 20 years was associated with greater FCWI at age 28 years. In females, no associations were found between FCWI at age 28 years and trajectory groups for BMI (P=0.185), SBP (P=0.289) or FBG (P=0.070). Conclusion: Having high BMI, SBP and FBG throughout childhood, adolescence and early adulthood was associated with higher FCWI in the carotid artery at age 28 years in males, but not females. This may have a direct impact on the etiology of cognitive decline and cerebrovascular disease in later life.

18

Dentine markers of pre/early postnatal lead exposure links with brain, cognitive, and behavioral outcomes in adolescents

Marshall, A. T.; Kan, E.; Adise, S.; König, M.; McConnell, R.; Martinez, M.; Midya, V.; Arora, M.; Sowell, E. R.

2026-05-27 pediatrics 10.64898/2026.05.26.26354134 medRxiv

Top 3%

0.0%

Lead is a toxic metal ubiquitous in our environment. While dramatic reductions in lead sources have paralleled equivalent decreases in lead-poisoning rates, chronic lead exposure remains a critical public health concern. Childhood lead exposure (at its lowest levels) is linked to changes in cognitive development, but less is known about lead's effects on children's brain structure, especially as a result of in utero exposure. We measured prenatal and early-postnatal lead exposure in shed deciduous teeth of 448 9- and 10-year-old children (from 20 United States cities) and linked those lead levels to childhood brain structure, cognition/behavior, and neighborhood- and family-level socioeconomic characteristics. Here we show negative associations between tooth-lead levels and the thickness of the brain's cortex, particularly in regions linked to language processing. With increasing tooth-lead levels, children of lower-income (versus higher-income) families showed steeper declines in receptive vocabulary. Caregiver-reported behavioral problems exhibited similar associations. With in utero exposure linked to adverse neurodevelopmental outcomes (well before lead exposure and its risks are evaluated by healthcare professionals), prenatal screening of maternal lead levels/exposure, coupled with recommended strategies to reduce its placental transmission, may help reduce lead's effects on future generations.

19

Data Assimilation Substitutes for Biological Complexity in Hybrid Influenza Forecasting Models

Alleman, T. W.; Van Wesemael, T.; Shanker, N.; Mietchen, M. S.; Loo, S.; Ajagbe, S. O.; Baetens, J. M.; Lemaitre, J.; Hill, A. L.; Truelove, S. A.; Bento, A. I.

2026-05-27 public and global health 10.64898/2026.05.19.26353597 medRxiv

Top 3%

0.0%

Hybrid mechanistic-statistical models offer interpretability and adaptability for short-term seasonal epidemic forecasting, but it remains unclear whether their accuracy depends more on increased biological complexity or on the assimilation of richer data. Using eight retrospective influenza seasons in North Carolina, we evaluate whether training on historical data and assimilating auxiliary emergency department (ED) visit data improves four-week-ahead hospital admission forecasts more than adding biological complexity (multi-subtype structure and cross-season immunity). Hierarchical Bayesian training on historical data improves accuracy by 22.4 % (95 % CI: 16.4-28.1 %), and inclusion of ED visit data yields a further 5.3 % (95 % CI: 3.0-7.6 %) improvement, whereas added biological complexity produces diminishing or null gains. We further observe a substitution effect in which ED visit data partially compensates for omitted biological structure. We deployed a simplified model variant in the 2025-2026 CDC FluSight Challenge and ranked among the top ensemble performers, supporting the robustness of Bayesian hierarchical training in real time. Together, these findings indicate that short-term forecast accuracy is driven more by historical learning and assimilating auxiliary signals than by biological fidelity, with implications for how forecasting systems should balance mechanistic complexity.

20

AI Adoption for NCDs in Kenya: A Qualitative Study

Rayo, J.; Cushny, W.; Mwangi, M.; Wanyee, S.; Linguraru, M. G.; Nyaga, N.; Koros, H.; Bosire, M.; Obuya, M.; Ngaruiya, C.

2026-05-27 public and global health 10.64898/2026.05.26.26354008 medRxiv

Top 3%

0.0%

Background: Non-communicable diseases (NCDs) represent a critical public health challenge in Kenya, responsible for over 50% of inpatient admissions and 40% of deaths. While digital health tools and artificial intelligence (AI) offer promising ways to improve prevention, diagnosis, and management, little is known about how these tools are perceived and used in practice. There is limited research exploring the views and lived experiences of young people in Kenya, who are a strategic priority for NCD prevention because behavioral risk factors are established in this window, and of community health promoters (CHPs), who provide health services within the community. This study aims to address this gap by examining perspectives on the burden of NCDs and the potential role of digital health technologies, including AI, for preventing and managing these conditions in these specific populations. Methods: A qualitative research design using focus group discussions (FGDs) was employed in Nairobi (urban) and Busia (rural) counties between March and July 2024. Eight FGDs were conducted with 60 participants purposively sampled from three stakeholder groups: CHPs, healthcare workers (HCWs), and youth aged 18-35 years. A semi-structured guide, co-developed with a Community Advisory Board, explored beliefs about NCDs, health-seeking behaviors, lifestyle practices, and attitudes toward digital health and AI. Audio recordings were transcribed verbatim, translated where necessary, and analyzed thematically using grounded theory principles in NVivo software (v12). Results: Six consolidated themes emerged: (1) understanding of NCDs and perceived risk; (2) barriers to NCD prevention and care; (3) the role of CHPs; (4) adoption of AI tools for NCD management; (5) trust, ethics and access concerns; and (6) community-driven recommendations for AI integration. Significant barriers, including stigma, economic constraints, and barriers to care, were documented alongside enthusiasm for AI tools among youth and CHPs in both urban and rural areas. Conclusion: This study shows that AI tools are being used for NCD prevention and management through spontaneous community adoption. However, it emphasizes the need for culturally relevant, equitable, and community-driven solutions. Effective scaling requires the identification and bridging of digital literacy gaps, the establishment of affordable infrastructure, the protection of data privacy, and the integration of AI tools into existing community health frameworks. This process should involve the collaboration of trusted intermediaries, such as CHPs and community leaders, to ensure successful outcomes. Future initiatives should prioritize participatory design, policy frameworks for ethical governance, and targeted capacity building to enhance acceptance and sustainability of digital health innovations in low- and middle-income country settings.